austechia is a Jupyter notebook that provides example code showing how to plot trees imported with baltic.


Copyright 2016 Gytis Dudas. Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

The classes of baltic


There are 3 main classes in baltic: node, leaf and tree.

The node and leaf classes are similar and share many attributes, such as branch length, height, position in absolute time, traits, parent, x and y coordinates, and the index of the character that defined them in the tree string. They differ in that the node class contains a list of its child objects for tree traversals, called children, and a list of all tips that ultimately descend from it, called leaves. The node class also contains less important attributes such as childHeight and numChildren. The leaf class contains two name variables: numName and name. numName corresponds to whatever was used to designate the tip in the tree string. name can be set later and is meant to handle scenarios where the tree string contains tip names encoded as numbers (Nexus format), which can be decoded into name rather than overwriting whatever was caught in numName.

The tree class wraps leaf and node classes together by performing operations to build, manipulate, visualise and analyse the full tree data structure. The recommended way of interacting with the tree is via the Objects list, which is a flat list of all branches in the tree.

baltic now also has a clade class. Clade objects are introduced when a subtree is collapsed and pose as tips for tree traversals.
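To make the relationship between these classes concrete, here is a minimal sketch written from the description above. It is not baltic's source: the `Branch` base class and the `add` method are hypothetical, and only the attribute names mentioned in the text (`children`, `leaves`, `numName`, `name`, `Objects`) are taken from it.

```python
class Branch:
    """Attributes the text lists as shared by nodes and leaves (hypothetical base class)."""

    def __init__(self, length=0.0, parent=None):
        self.length = length      # branch length
        self.parent = parent      # parent Node, or None for the root
        self.traits = {}          # annotations, e.g. {"posterior": 0.98}
        # Height is the distance from the root, accumulated along the path.
        self.height = length + (parent.height if parent is not None else 0.0)


class Node(Branch):
    def __init__(self, length=0.0, parent=None):
        super().__init__(length, parent)
        self.children = []        # direct descendants, used for traversals
        self.leaves = []          # labels of every tip that descends from here


class Leaf(Branch):
    def __init__(self, numName, length=0.0, parent=None):
        super().__init__(length, parent)
        self.numName = numName    # label exactly as found in the tree string
        self.name = None          # decoded label, filled in later


class Tree:
    def __init__(self):
        self.root = None
        self.Objects = []         # flat list of every branch in the tree

    def add(self, branch):
        """Register a branch and wire it to its parent and ancestors (hypothetical helper)."""
        self.Objects.append(branch)
        if branch.parent is None:
            self.root = branch
            return branch
        branch.parent.children.append(branch)
        if isinstance(branch, Leaf):
            ancestor = branch.parent
            while ancestor is not None:
                ancestor.leaves.append(branch.numName)
                ancestor = ancestor.parent
        return branch


# Build ((1:1,2:2):1,3:3) by hand, then decode the numeric tip labels.
tree = Tree()
root = tree.add(Node())
inner = tree.add(Node(length=1.0, parent=root))
tree.add(Leaf("1", length=1.0, parent=inner))
tree.add(Leaf("2", length=2.0, parent=inner))
tree.add(Leaf("3", length=3.0, parent=root))

translate = {"1": "A", "2": "B", "3": "C"}   # as in a Nexus translate block
for k in tree.Objects:
    if isinstance(k, Leaf):
        k.name = translate[k.numName]        # numName itself is left untouched

tip_names = [k.name for k in tree.Objects if isinstance(k, Leaf)]
tallest = max(k.height for k in tree.Objects)
```

Iterating over the flat `Objects` list, as in the last two lines, is usually enough to answer questions about the tree without writing a recursive traversal.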

The function


At the top is the make_tree function. Given a valid tree string, it transforms the information contained in the string into an actual tree data structure. If the tree string contains elements it can't parse, it warns the user. The most common causes are unexpected characters in branch labels or in tip names (particularly illegal characters: whitespace, parentheses, commas, semicolons, etc).
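The general idea of walking through a tree string character by character can be sketched as follows. This is a simplified stand-in, not make_tree itself: it handles plain Newick only (no branch annotations), stores branches as nested dictionaries, and raises an error where make_tree is described as warning. The names `parse_newick` and `TOKEN` are made up for this example.

```python
import re

# A label free of Newick-reserved characters, with an optional ":branch_length".
TOKEN = re.compile(r"([^\s(),:;\[\]]*)(?::([0-9.eE+-]+))?")


def new_branch(parent, name=None, length=0.0):
    return {"name": name, "length": length, "children": [], "parent": parent}


def parse_newick(tree_string):
    """Parse a plain Newick string into nested dictionaries."""
    holder = new_branch(None)          # placeholder that sits above the real root
    cur = holder
    i = 0
    while i < len(tree_string):
        ch = tree_string[i]
        if ch == "(":                  # open a new internal node
            node = new_branch(cur)
            cur["children"].append(node)
            cur = node
            i += 1
        elif ch == ",":                # sibling separator
            i += 1
        elif ch == ";":                # end of the tree string
            break
        elif ch == ")":                # close the node, read its label and length
            m = TOKEN.match(tree_string, i + 1)
            cur["name"] = m.group(1) or None
            cur["length"] = float(m.group(2)) if m.group(2) else 0.0
            cur = cur["parent"]
            i = m.end()
        else:                          # anything else should be a tip
            m = TOKEN.match(tree_string, i)
            if m.end() == i:
                raise ValueError(f"cannot parse character {ch!r} at index {i}")
            length = float(m.group(2)) if m.group(2) else 0.0
            cur["children"].append(new_branch(cur, m.group(1), length))
            i = m.end()
    return holder["children"][0]


def tip_names(node):
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in tip_names(child)]


tree = parse_newick("((A:1,B:2)n1:0.5,C:3);")
names = tip_names(tree)

# Whitespace inside a tip name is one of the illegal characters mentioned above.
try:
    parse_newick("(A B:1,C:2);")
    problem = None
except ValueError as error:
    problem = str(error)
```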

Tree import


baltic was primarily written to handle FigTree files with rich branch annotations. These nexus files can be loaded via the loadNexus() function, or if you have a newick tree you can use loadNewick(). loadNexus() accepts regular expressions to find dates for tips, if they are encoded in the names, and to find tree strings (if using nexus files not generated in BEAST).
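As an illustration of what a tip date regular expression does, the sketch below pulls a date off the end of each tip name and converts it to a decimal year. The tip names, the `tip_regex` pattern and the `decimal_date` helper are all assumptions made for this example; the pattern you need depends entirely on how your own tips are named.

```python
import re
from datetime import datetime


def decimal_date(date_string, fmt="%Y-%m-%d"):
    """Convert a calendar date to a decimal year, e.g. 2016-07-02 -> 2016.5."""
    date = datetime.strptime(date_string, fmt)
    year_start = datetime(date.year, 1, 1)
    year_end = datetime(date.year + 1, 1, 1)
    return date.year + (date - year_start) / (year_end - year_start)


# Hypothetical naming scheme: fields separated by "|", collection date last.
tip_regex = re.compile(r"\|([0-9]{4}-[0-9]{2}-[0-9]{2})$")

tip_names = [
    "virusA|sample1|2014-09-25",
    "virusA|sample2|2016-07-02",
    "undated_tip",
]

tip_dates = {}
for name in tip_names:
    match = tip_regex.search(name)
    if match:                          # tips without a recognisable date are skipped
        tip_dates[name] = decimal_date(match.group(1))
```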

Plotting trees


Once the drawTree() method of the tree object has been called, the tree can be drawn by iterating over every branch in the tree object and plotting it. How the tree is plotted in terms of colour, branch width, tip sizes, tip labels, etc is left up to the user.
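The two steps described above, assigning coordinates and then iterating over branches, can be sketched without any plotting library. The layout rule used here (tips take consecutive y positions, nodes sit at the mean y of their children) is a common convention and an assumption on my part, not necessarily what drawTree() does. The `layout` and `branches` functions are hypothetical.

```python
def branch(name, length, *children):
    return {"name": name, "length": length, "children": list(children)}


def layout(node, parent_x=0.0, next_y=None):
    """Assign x (height) and y (vertical position) to every branch, depth first."""
    if next_y is None:
        next_y = [0]                   # mutable counter shared across the recursion
    node["x"] = parent_x + node["length"]
    if node["children"]:
        for child in node["children"]:
            layout(child, node["x"], next_y)
        ys = [child["y"] for child in node["children"]]
        node["y"] = sum(ys) / len(ys)  # nodes sit midway between their children
    else:
        node["y"] = next_y[0]          # tips take consecutive positions
        next_y[0] += 1
    return node


def branches(node):
    """Yield every branch in the tree, parents before children."""
    yield node
    for child in node["children"]:
        yield from branches(child)


tree = branch("root", 0.0,
              branch("n1", 1.0, branch("A", 1.0), branch("B", 2.0)),
              branch("C", 3.0))
layout(tree)

# Each branch becomes a horizontal segment; each node adds a vertical connector.
segments = []
for k in branches(tree):
    segments.append(((k["x"] - k["length"], k["y"]), (k["x"], k["y"])))
    if k["children"]:
        ys = [child["y"] for child in k["children"]]
        segments.append(((k["x"], min(ys)), (k["x"], max(ys))))
```

Each entry in `segments` is a pair of (x, y) points that could be handed to a plotting call, with colour and line width chosen per branch.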

Collapsing branches


Branches whose trait or attribute values don't satisfy a user-supplied function can be collapsed. The returned tree object then needs to be redrawn using the drawTree() method to get new y positions for branches.
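A minimal sketch of the idea, assuming a nested-dictionary tree rather than baltic's classes: internal nodes that fail a test are removed, and their children are reattached to the parent with the removed branch length added on. The `collapse` function and the posterior threshold are illustrative only.

```python
def branch(name, length, traits=None, children=()):
    return {"name": name, "length": length,
            "traits": traits or {}, "children": list(children)}


def collapse(node, keep):
    """Return a copy of the tree without the internal nodes that fail keep().

    Children of a removed node are attached to its parent and absorb its
    branch length, so root-to-tip distances are unchanged. The root is kept.
    """
    new_children = []
    for child in node["children"]:
        child = collapse(child, keep)              # work on a copy, bottom up
        if child["children"] and not keep(child):
            for grandchild in child["children"]:
                grandchild["length"] += child["length"]
                new_children.append(grandchild)
        else:
            new_children.append(child)
    return {**node, "children": new_children}


tree = branch("root", 0.0, children=[
    branch("n1", 1.0, {"posterior": 0.3}, [branch("A", 1.0), branch("B", 2.0)]),
    branch("C", 3.0),
])

# Keep only well supported nodes; the 0.5 cutoff is an arbitrary example value.
collapsed = collapse(tree, keep=lambda k: k["traits"].get("posterior", 0.0) >= 0.5)
child_names = [k["name"] for k in collapsed["children"]]
child_lengths = [k["length"] for k in collapsed["children"]]
```

Because the number and order of branches change, any previously computed y positions are stale after this step, which is why a redraw is needed.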

Extract trait subtrees, fix subtrees


The following cell showcases code to decompose trees with trait labels into individual subtrees, which are recovered from within-trait tree traversals. Occasionally it will yield subtrees with only one valid child (i.e. a single tip) and multitype trees.
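One way to picture a within-trait traversal is sketched below, using nested dictionaries and hypothetical function names (`within_trait`, `trait_subtrees`). A new subtree starts wherever a branch's trait value differs from its parent's, and the traversal stops at any branch that leaves that value. This is a sketch of the concept, not the notebook's code.

```python
def branch(name, location, *children):
    return {"name": name, "traits": {"location": location}, "children": list(children)}


def within_trait(node, trait, value):
    """Copy node, keeping only descendants reachable without leaving value."""
    kept = [within_trait(child, trait, value)
            for child in node["children"]
            if child["traits"][trait] == value]
    return {**node, "children": kept}


def trait_subtrees(node, trait, parent_value=None, found=None):
    """Collect (value, subtree) pairs, one per switch into a trait value."""
    if found is None:
        found = []
    value = node["traits"][trait]
    if value != parent_value:          # the trait changed on this branch
        found.append((value, within_trait(node, trait, value)))
    for child in node["children"]:
        trait_subtrees(child, trait, value, found)
    return found


tree = branch("root", "X",
              branch("n1", "X", branch("A", "X"), branch("B", "Y")),
              branch("C", "Y"))

subtrees = trait_subtrees(tree, "location")
values = [value for value, _ in subtrees]
first = subtrees[0][1]
```

In this toy tree the "X" subtree keeps `root` and `n1` with a single child each, which is the kind of single-child node the paragraph above warns about.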

Tree spectrum


This next bit plots every extracted subtree onto a single plot.

Reduced trees


This bit shows how to recover and plot reduced trees, which contain a subset of the tips from the full tree but preserve the evolutionary history from which the remaining tips descend.
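The pruning logic can be sketched as below, again on nested dictionaries. This version merges any node left with a single child into that child so that root-to-tip distances are preserved; baltic may treat such nodes differently, and `reduce_tree` is a made-up name.

```python
def branch(name, length, *children):
    return {"name": name, "length": length, "children": list(children)}


def reduce_tree(node, keep):
    """Return a pruned copy containing only the tips named in keep.

    Returns None when no kept tip descends from node.
    """
    if not node["children"]:
        return dict(node) if node["name"] in keep else None
    kept = []
    for child in node["children"]:
        reduced = reduce_tree(child, keep)
        if reduced is not None:
            kept.append(reduced)
    if not kept:
        return None
    if len(kept) == 1:
        # A node left with one child is merged into it, keeping the path length.
        only = kept[0]
        only["length"] += node["length"]
        return only
    return {**node, "children": kept}


tree = branch("root", 0.0,
              branch("n1", 1.0, branch("A", 1.0), branch("B", 2.0)),
              branch("n2", 2.0, branch("C", 1.0), branch("D", 1.0)))

reduced = reduce_tree(tree, keep={"A", "C", "D"})
child_names = [k["name"] for k in reduced["children"]]
```

Tip A ends up attached directly to the root with a branch length of 2.0, the sum of its own branch and that of the removed node n1.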

Turning a tree data structure into a string

baltic can convert the data structure contained within the tree class into a tree string. Tips can be labelled with either numName (the name of the tip as it appeared in the tree string) or the name attribute (if tips were decoded from their BEAST encoding), and BEAST annotations can optionally be included.
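A recursive writer along these lines shows how the two labelling options and the annotations fit together. The `to_newick` function, its arguments and the trait values are assumptions for illustration and do not mirror baltic's own method signature.

```python
def to_newick(node, use_numName=False, traits=()):
    """Serialise a nested-dictionary tree, optionally with BEAST-style annotations."""
    if node["children"]:
        inner = ",".join(to_newick(child, use_numName, traits)
                         for child in node["children"])
        label = "(" + inner + ")"
    else:
        label = node["numName"] if use_numName else node["name"]
    values = node.get("traits", {})
    parts = []
    for key in traits:
        if key in values:
            value = values[key]
            # Strings are quoted, numbers are written as they are.
            parts.append(f'{key}="{value}"' if isinstance(value, str) else f"{key}={value}")
    annotation = "[&" + ",".join(parts) + "]" if parts else ""
    return f"{label}{annotation}:{node['length']}"


tree = {
    "length": 0.0,
    "children": [
        {"numName": "1", "name": "A", "length": 1.0,
         "traits": {"location": "X"}, "children": []},
        {"numName": "2", "name": "B", "length": 2.5,
         "traits": {"location": "Y"}, "children": []},
    ],
}

plain = to_newick(tree) + ";"
annotated = to_newick(tree, use_numName=True, traits=("location",)) + ";"
```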

Tree transformations


This next bit plots the tree in terms of the trait transitions that take place within it. Trait space is where things get plotted, but other dimensions, such as time, can be represented with colour or line width, as desired. I've also added code to do Bezier curves, which I think are an amazing tool in a scientist's arsenal, mostly because they can be customised to prevent overlapping with each other and to highlight links between closely positioned points.
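For readers unfamiliar with Bezier curves, a quadratic curve between two points takes only a few lines. The sketch below is a generic construction, not the notebook's implementation: the control point is pushed sideways from the midpoint in proportion to the distance between the endpoints, so longer links arc further out. The `bend` parameter is a hypothetical knob.

```python
def control_point(start, end, bend=0.3):
    """Offset the midpoint perpendicular to the line from start to end."""
    mid_x = (start[0] + end[0]) / 2
    mid_y = (start[1] + end[1]) / 2
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    return (mid_x - dy * bend, mid_y + dx * bend)


def bezier(start, end, bend=0.3, num=21):
    """Return num points along a quadratic Bezier curve from start to end."""
    control = control_point(start, end, bend)
    points = []
    for i in range(num):
        t = i / (num - 1)
        a = (1 - t) ** 2               # weight of the start point
        b = 2 * (1 - t) * t            # weight of the control point
        c = t ** 2                     # weight of the end point
        points.append((a * start[0] + b * control[0] + c * end[0],
                       a * start[1] + b * control[1] + c * end[1]))
    return points


curve = bezier((0.0, 0.0), (4.0, 0.0))
opposite = bezier((0.0, 0.0), (4.0, 0.0), bend=-0.3)   # arcs the other way
```

Flipping the sign of `bend` for the reverse direction of a link is one simple way to keep two curves between the same pair of points from overlapping.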

Tangled chains (sequential tanglegrams)


The following code imports a bunch of trees, collapses nodes, plots them end to end coloured by the trait value of the first tree and connects the same tips by lines that follow the order of tips in the first tree.

Each neighbouring tree first needs to be iteratively untangled as much as possible.
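A simple untangling heuristic is to rotate the children of every node in the second tree so that they follow the tip order of the first. The sketch below sorts children by the mean reference position of their tips. It is one possible approach written for illustration, with made-up function names, and is not claimed to be the algorithm used in the notebook.

```python
def tip(name):
    return {"name": name, "children": []}


def internal(*children):
    return {"name": None, "children": list(children)}


def tip_order(node):
    """Tip names in the order they would be plotted."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in tip_order(child)]


def untangle(node, rank):
    """Return a copy with children sorted by the mean reference rank of their tips."""
    def mean_rank(n):
        names = tip_order(n)
        return sum(rank[name] for name in names) / len(names)

    children = sorted((untangle(child, rank) for child in node["children"]),
                      key=mean_rank)
    return {**node, "children": children}


first = internal(internal(tip("A"), tip("B")), internal(tip("C"), tip("D")))
second = internal(internal(tip("D"), tip("C")), internal(tip("B"), tip("A")))

# Position of every tip in the first tree, used as the reference for the second.
rank = {name: position for position, name in enumerate(tip_order(first))}
untangled = untangle(second, rank)
```

For a chain of trees the same step would be repeated pairwise, using each untangled tree as the reference for its neighbour.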

Radial trees and shutter plots


The cell below shows code that can be used to plot radial trees, in addition to code that could be used to plot a series of trees in a circle facing inwards with a particular isolate highlighted in all plotted phylogenies. This was a suggestion proposed by Anne-Mieke Vandamme at the amazing Virus Genomics and Evolution (#VGE16) meeting in Cambridge in June 2016.

This cell re-orients the tree such that time now follows the circle.
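Both layouts come down to a change of coordinates from the rectangular x (time or height) and y (tip position) values. The functions below are a generic sketch with hypothetical names and parameters: the first uses height as the radius and tip position as the angle, the second swaps the roles so that time runs around the circle.

```python
import math


def to_radial(x, y, y_max, inner_radius=0.0, arc=2 * math.pi):
    """Radial tree: height is the radius, tip position is the angle."""
    theta = arc * y / y_max
    radius = inner_radius + x
    return radius * math.cos(theta), radius * math.sin(theta)


def to_time_circle(x, y, x_max, inner_radius=0.0, arc=2 * math.pi):
    """Re-oriented tree: time is the angle, tip position is the radius."""
    theta = arc * x / x_max
    radius = inner_radius + y
    return radius * math.cos(theta), radius * math.sin(theta)


# A tip at height 2.0 and position 1 of 4 lands a quarter of the way round.
px, py = to_radial(2.0, 1.0, y_max=4.0)

# The same branch with time mapped to the angle instead (half way through the period).
qx, qy = to_time_circle(2.0, 1.0, x_max=4.0)
```

Setting `arc` to less than a full circle leaves a gap, and a non-zero `inner_radius` keeps the root away from the centre, which helps when several trees share one circle.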

Multitype trees

baltic can now deal with multitype trees recovered as part of structured coalescent analyses, which contain nodes with a single child. You can find an example of the files you might find after running a structured coalescent analysis in BEAST2 here.

Collapsing clades

baltic allows subtrees to be collapsed. When given a node object, the collapseSubtree function will replace that node and all of its descendants with a clade object. Clade objects pose as tips, but contain attributes that allow them to be plotted in a way that represents how many tips were present in the collapsed clade or when the most recent tip of the collapsed subtree existed.
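The bookkeeping involved can be sketched on a nested-dictionary tree. The `collapse_subtree` function and the `width` and `lastHeight` keys are illustrative stand-ins for the attributes described above and should not be read as the exact names used by collapseSubtree.

```python
def branch(name, length, height, *children):
    return {"name": name, "length": length, "height": height,
            "children": list(children)}


def tips(node):
    """All tips that descend from node."""
    if not node["children"]:
        return [node]
    return [t for child in node["children"] for t in tips(child)]


def collapse_subtree(parent, node, name):
    """Swap node (a child of parent) for a tip-like clade placeholder."""
    members = tips(node)
    clade = {
        "name": name,
        "length": node["length"],
        "height": node["height"],
        "children": [],                                      # traversals stop here
        "leaves": [t["name"] for t in members],
        "width": len(members),                               # tips in the clade
        "lastHeight": max(t["height"] for t in members),     # most recent tip
    }
    parent["children"] = [clade if child is node else child
                          for child in parent["children"]]
    return clade


n1 = branch("n1", 1.0, 1.0, branch("A", 1.0, 2.0), branch("B", 2.0, 3.0))
root = branch("root", 0.0, 0.0, n1, branch("C", 3.0, 3.0))

clade = collapse_subtree(root, n1, "clade_1")
remaining = [k["name"] for k in root["children"]]
```

When plotting, `width` could set the vertical extent of a triangle and `lastHeight` its right-hand edge.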

Unrooted trees can now be drawn too.

Clade frequencies


Richard Neher (at University of Basel) has written a script to calculate clade frequencies (smoothed nested frequency trajectories of tips) over time which is part of nextstrain's augur module. The next few cells show how clade frequencies can be incorporated into baltic via Biopython's phylogenetics parts and plotted.

Reticulate trees


Nicola Müller (formerly at ETH Zurich in Basel, now at Fred Hutch), Ugnė Stolz, Tim Vaughan, and Tanja Stadler (all based at ETH Zurich in Basel) have implemented a reassortment algorithm in BEAST2 which reconstructs clonal trees with reticulate edges. The next cell shows how to plot a summarised tree from these types of analyses in baltic.